Abstract

The overarching goal of this project is to provide the necessary tools and techniques for
supporting  general-purpose  secure  computation  and  outsourcing.  The  three  main  thrusts  of 
the  project  are:  (i)  development  of  efficient  techniques  for  securely  working  with  standard  data 
types,  (ii)  designing  efficient  data-oblivious  algorithms  and  data  structures  suitable  for  secure 
computation  and  outsourcing,  and  (iii)  building  a  compiler  for  translating  a  program  written 
in  a  conventional  programming  language  which  is  intended  to  handle  private  data  into  the 
corresponding  secure  distributed  implementation  that  provably  protects  private  data  throughout 
program  execution.  This  report  summarizes  the  research  findings  of  the  project  and  scientific 
advances  made  towards  each  of  the  research  thrusts  throughout  the  project  duration. 

1  Project  Objectives 

Cloud  computing  enables  convenient  on-demand  access  to  computing  and  data  storage  resources, 
which  can  be  configured  to  meet  unique  clients’  constraints  and  utilized  with  minimal  management 
overhead.  The  recent  rapid  growth  in  availability  of  cloud  services  makes  them  attractive  and 
economically  sensible  for  clients  with  limited  computing  or  storage  resources  who  are  unable  to 
procure  and  maintain  their  own  computing  infrastructure.  This  includes  numerous  applications 
in  commercial,  government,  and  military  domains,  including,  e.g.,  weak  devices  such  as  sensors 
operating outside the base. One of the largest opportunities that the cloud enables is computation
outsourcing, where the client can utilize any necessary computing resources for its computational
task. Security considerations, however, stand in the way of harnessing the full benefits of cloud
computing and prevent clients from placing their sensitive data or computations
on the cloud. This is of utmost importance for data concerning national security, but even in
non-military contexts businesses are hesitant to make their proprietary data available to the cloud [1].
While in general sensitive data can be protected by means of encryption, traditional encryption
is not suitable for computation over data. Protection of the data in outsourced computation was
thus  set  to  be  one  of  the  main  goals  of  this  research. 

The  broad  goal  of  this  research  project  is  to  develop  techniques  suitable  for  secure  and  general 
data  processing  and  outsourcing.  The  desire  to  carry  out  computation  in  a  privacy-preserving 
manner  without  revealing  information  about  the  sensitive  inputs  throughout  the  computation  is 
not  new:  it  has  been  a  topic  of  research  since  Yao’s  seminal  work  on  secure  function  evaluation  [9]. 
However, despite the sheer volume of research literature on privacy-preserving computation and
newly  appearing  secure  outsourcing  techniques,  most  of  the  previously  available  techniques  focused 
on  rather  narrow  domains  such  as  integer-based  arithmetic,  keyword  search  over  encrypted  data,  or 
two-party set operations. Little or no attention has been paid to other types of computation, as well
as  to  data  structures  and  algorithms  suitable  for  secure  data  processing  in  untrusted  environments. 

Another  reason  why  secure  computation  techniques  are  not  commonly  used  in  practice  is  their 
complexity  and  overhead.  Recent  progress  in  the  performance  of  secure  multi-party  computation 
techniques,  however,  demonstrated  that  secure  computation  can  be  very  fast  (e.g.,  millions  of 
operations  per  second  performed  by  Sharemind  on  a  LAN).  This,  combined  with  the  shift  toward 
cloud  computing  and  storage,  offered  a  major  incentive  for  further  development  of  new  techniques 
for  general-purpose  secure  data  processing.  We  therefore  believed  that  it  was  the  prime  time  to 
enable  privacy-preserving  execution  of  any  functionality  or  program,  and  the  grand  goal  of  this 
research  was  to  develop  techniques  for  securely  computing  on  data  of  different  types  and  their 
collections, including oblivious data structures and algorithms. An execution is data-oblivious
if the sequence of executed instructions and the sequence of accessed memory locations are
independent of the input.

Toward  that  goal,  this  research  intended  to  cover  new  techniques  for  major  data  types  and 
collections, such as boolean, integer, and real values, strings, sets, vectors, and matrices.
Furthermore, to facilitate the use of secure general-purpose computing, research was needed to develop
data-oblivious algorithms and data structures for common tasks such as search and graph
algorithms. Note that the great majority of data structures and algorithms commonly used in practice
are not data-oblivious, while naive approaches for achieving data-obliviousness incur a
substantial increase in computation time over best-known solutions (compare, for instance, non-oblivious
logarithmic-time binary search with oblivious linear-time scan).
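
The contrast between the two search strategies can be sketched in plain Python. This is only an illustration over cleartext values, not code from the project: the function names are hypothetical, and in a secure setting the equality test and the OR would be replaced by protocols over protected values. The sketch shows that the linear scan touches every element in a fixed order, while the probe positions of binary search depend on the data.

```python
def oblivious_lookup(values, target):
    """Return 1 if target occurs in values, else 0, touching every element."""
    found = 0
    for v in values:              # loop bound depends only on len(values)
        match = int(v == target)  # 0 or 1; stands in for a secure equality test
        found = found | match     # no branch on the comparison result
    return found


def binary_search(sorted_values, target):
    """Ordinary binary search; also returns the indices it probed."""
    lo, hi = 0, len(sorted_values) - 1
    probes = []
    while lo <= hi:
        mid = (lo + hi) // 2
        probes.append(mid)        # the probe sequence depends on the data
        if sorted_values[mid] == target:
            return True, probes
        if sorted_values[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False, probes
```

For the list [1, 3, 5, 7, 9], binary search probes different positions when looking for 1 than when looking for 9, which is exactly the kind of input-dependent access pattern that a data-oblivious algorithm must avoid.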

We  make  a  distinction  between  the  party  or  parties  who  hold  private  inputs  and  computational 
parties  who  conduct  the  computation.  This  allows  the  framework  to  be  used  in  many  contexts 
including  secure  joint  computation  by  multiple  parties  and  computation  outsourcing  by  one  or 
more  parties.  Our  techniques  are  information-theoretically  secure  and  promise  to  be  particularly 
efficient  and  suitable  for  large-scale  applications. 

To  foster  adoption  of  our  and  previously  developed  techniques,  another  goal  of  this  project 
was  to  build  a  compiler  that  translates  a  program  written  in  a  high-level  C-like  language  to  an 
executable which can be securely evaluated by a number of parties. The goal was to support as
wide a range of functionalities as possible, i.e., as long as the functionality is known at run time,
it can be securely evaluated in our framework. Performance of compiled programs was intended
to be evaluated on a number of diverse applications including statistical analysis and biometric
processing, as well as commonly used operations and data structures, which are of high relevance to
the government, military, and commercial sectors.

2  Project  Research  Results 

In this section we summarize research findings of the project. The description is structured
according to the three main thrusts of the project, which are: (i) development of efficient techniques for
securely working with standard data types, (ii) designing efficient data-oblivious algorithms and
data structures suitable for secure computation and outsourcing, and (iii) building a compiler for
translating a program written in a conventional programming language which is intended to
handle private data into the corresponding secure distributed implementation that provably protects
private data throughout program execution.

Research publications associated with this project are [4, 7, 13, 10, 11, 6, 5, 3]. All of them
acknowledge support from this research grant.

2.1 Support for Secure Processing of Standard Data Types

Previously,  research  on  securely  handling  private  data  almost  entirely  focused  on  integer  operations. 
Secure  and  efficient  implementations  of  integer  arithmetic  could  also  be  used  to  implement  Boolean 
operations  and  string  manipulations  with  strings  represented  as  arrays  of  integer  values.  Support 
for  proper  floating-point  operations  on  private  real  numbers  was,  however,  lacking  and  closing  this 
gap  has  been  the  focus  of  this  research.  As  part  of  this  research,  we 

1.  designed  efficient  secure  multi-party  techniques  for  floating-point  computation  in  a  standard 
linear  secret  sharing  framework.  This  includes  a  variety  of  operations  such  as  addition, 
subtraction,  multiplication,  division,  comparisons,  rounding,  conversion  to  and  from  integers, 
and  supplemental  operations. 

2.  designed  efficient  (and  fast  converging)  secure  protocols  for  complex  operations  over  real 
numbers  such  as  square  root,  logarithm,  and  exponentiation. 

3.  evaluated  the  developed  and  existing  techniques  for  integer,  fixed  point,  and  floating  point 
arithmetic  and  demonstrated  efficiency  of  the  developed  protocols  despite  complexity  of  the 
operations. 
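
To illustrate why floating-point support can be built on top of secure integer arithmetic, the following cleartext sketch models a real number as a triple of integers and multiplies two such numbers using only integer operations. The representation (significand v, exponent p, sign s, with value (-1)^s * v * 2^p and a 24-bit normalized significand), the function names, and the use of truncation instead of rounding are assumptions made for this illustration; the actual protocols are described in [4].

```python
import math

L = 24  # assumed significand length in bits


def encode(x):
    """Encode a float as (v, p, s) with x ~ (-1)^s * v * 2^p, 2^(L-1) <= v < 2^L."""
    if x == 0:
        return (0, 0, 0)
    m, e = math.frexp(abs(x))        # abs(x) = m * 2^e with 0.5 <= m < 1
    return (int(m * (1 << L)), e - L, int(x < 0))


def decode(t):
    v, p, s = t
    return (-1) ** s * math.ldexp(v, p)


def fl_mul(a, b):
    """Multiply two encoded values using integer operations only."""
    v1, p1, s1 = a
    v2, p2, s2 = b
    v = (v1 * v2) >> (L - 1)         # product has up to 2L bits; truncate
    overflow = int(v >= (1 << L))    # 0 or 1; stands in for a secure comparison
    v = v >> overflow                # renormalize without branching
    p = p1 + p2 + (L - 1) + overflow
    return (v, p, s1 ^ s2)
```

Each step (integer multiplication, truncation, comparison, addition, XOR of sign bits) corresponds to an operation for which secure integer protocols exist, which is why the cost of a secure floating-point operation is a small multiple of the cost of the underlying integer operations.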

Details of the design and implementation are available from [4]. Subsequently, we applied this design
to secure two-party computation techniques based on homomorphic encryption for the setting where
alternative frameworks are not an option [2]. In our further research, we strengthened the security
guarantees of the constructions to be resilient to adversarial behavior in the strongest security
model through a number of novel protocols and zero-knowledge techniques [3].

We  also  extended  our  work  on  private  and  data-oblivious  set  and  multiset  operations  [5]  and 
published  new  techniques  on  secure  and  verifiable  matrix  multiplication  outsourcing  [10]. 

2.2  Data-Oblivious  Algorithms 

Our  work  on  data-oblivious  algorithms  primarily  focused  on  graph  algorithms.  Graph  algorithms 
are  fundamental  in  computer  science  and  are  used  in  a  variety  of  applications.  Given  a  graph 
G = (V, E) as the input, our solutions use an adjacency matrix representation of the graph. This
representation has size O(|V|^2) and is asymptotically optimal for dense graphs with |E| = O(|V|^2).
We  developed  a  number  of  novel  data-oblivious  graph  algorithms  for  classical  graph  problems 
which  lead  to  secure  constructions  for  evaluating  such  problems  in  secure  multi-party  computation 
or  secure  outsourcing  settings.  In  particular,  we  designed  the  following  data-oblivious  algorithms: 

•  breadth-first search (BFS) of complexity O(|V|^2),

•  single-source single-destination (SSSD) shortest path of complexity O(|V|^2),

•  minimum spanning tree of complexity O(|V|^2),

•  maximum flow of complexity O(|V|^3 |E| log(|V|)),

•  and maximum matching size in bipartite graphs of complexity O(|V|^3 log(|V|)).

The  details  of  our  techniques  are  available  from  [7,  6].  Our  research  also  treated  data-oblivious 
data  structures  as  described  in  the  next  section. 
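
To make the notion of a data-oblivious graph algorithm concrete, the sketch below computes BFS levels over an adjacency matrix so that the order of memory accesses depends only on |V|. It is a naive O(|V|^3) baseline written for illustration, not the O(|V|^2) algorithm of [7, 6]; the function name is hypothetical, the source vertex is assumed to be public, and the comparisons and the arithmetic selection stand in for secure operations on protected values.

```python
def oblivious_bfs_levels(adj, source):
    """Return BFS distances from source; unreachable vertices get the value n.

    adj is an n x n 0/1 adjacency matrix. Every (level, v, u) triple is
    visited in the same order regardless of which edges are present.
    """
    n = len(adj)
    unreached = n                      # sentinel larger than any real distance
    dist = [unreached] * n
    dist[source] = 0                   # source is assumed public in this sketch
    for level in range(1, n):
        for v in range(n):
            for u in range(n):
                at_frontier = int(dist[u] == level - 1)
                not_seen = int(dist[v] == unreached)
                cond = adj[u][v] & at_frontier & not_seen
                # arithmetic selection instead of a data-dependent branch
                dist[v] = cond * level + (1 - cond) * dist[v]
    return dist
```

The triple loop always performs the same n * n * (n - 1) updates, so an observer of the access pattern learns nothing about the edge set beyond the number of vertices. Reducing the cost to O(|V|^2) requires the additional techniques described in [7, 6].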

2.3 Compiler for Secure Distributed Computation

As  part  of  the  third  thrust,  we  introduced  PICCO  (Private  distributed  Computation  COmpiler)  - 
a  system  for  translating  a  general-purpose  program  for  computing  with  private  data  into  its  secure 
implementation  and  executing  the  program  in  a  distributed  environment.  The  main  component 
of  PICCO  is  a  source-to-source  compiler  that  translates  a  program  written  in  an  extension  of 
the  C  programming  language  with  provisions  for  annotating  private  data  to  its  secure  distributed 
implementation  in  C.  The  C  language  was  chosen  due  to  its  popularity  and,  more  importantly, 
performance reasons. The resulting program can subsequently be compiled by the native compiler
and  securely  run  by  a  number  of  computational  nodes  in  the  cloud  or  similar  environment.  Besides 
the  compiler,  PICCO  includes  programs  that  aid  secure  execution  of  user  programs  in  a  distributed 
environment  by  preprocessing  private  inputs  and  recovering  outputs  at  the  end  of  the  computation. 

The  techniques  underlying  PICCO’s  secure  execution  build  on  a  threshold  linear  secret  sharing 
scheme for representation of and secure computation over private values. This setting was chosen
due to its flexibility (i.e., permitting both secure multi-party computation and secure computation
outsourcing)  and  speed.  Thus,  secure  execution  of  the  compiled  programs  is  performed  by  n  >  2 
computational  parties. 
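
As a minimal illustration of a threshold linear secret sharing scheme, the sketch below implements Shamir secret sharing over a prime field with n = 3 parties and threshold t = 1, and shows that addition of two shared values is a purely local operation on shares. The field modulus, the function names, and the parameter choices are assumptions made for this example and are not taken from PICCO.

```python
import secrets

P = 2**61 - 1  # a Mersenne prime, used here as an example field modulus


def share(secret, n=3, t=1):
    """Split secret into n shares; any t + 1 of them reconstruct it."""
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(t)]
    return [
        (x, sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P)
        for x in range(1, n + 1)
    ]


def reconstruct(points):
    """Lagrange interpolation at 0 from a list of (x, y) shares."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def add_shares(a, b):
    """Each party adds its own two shares; no communication is needed."""
    return [(x, (ya + yb) % P) for (x, ya), (_, yb) in zip(a, b)]
```

With t = 1, any single share is uniformly distributed and reveals nothing about the secret, while any two shares suffice for reconstruction. Multiplication of shared values, in contrast to addition, requires interaction among the computational parties.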

The  compiler  supports  all  features  of  the  C  language  and  all  programs  that  do  not  result  in 
revealing  information  about  private  values.  For  example,  the  number  of  loop  iterations  in  a  program 
cannot  depend  on  private  values  as  this  information  cannot  be  revealed  even  at  program  runtime, 
data  flow  from  a  private  to  a  public  variable  is  not  permitted,  and  functions  executed  within  a 
conditional  statement  with  a  private  condition  are  not  permitted  to  have  public  side  effects.  The 
original PICCO design in [13] did not support the use of C pointers (and thus features such as
dynamic memory allocation), and this limitation was subsequently addressed in [11].
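
The restriction on public side effects follows from a common way of handling a branch on a private condition: both branches are evaluated and the result is selected arithmetically, so neither party learns which branch was taken. The cleartext sketch below illustrates this general idea; the helper names are hypothetical, and the sketch is not a description of the code that PICCO generates.

```python
def private_select(cond_bit, then_value, else_value):
    """Return then_value if cond_bit is 1, else else_value, without branching."""
    return cond_bit * then_value + (1 - cond_bit) * else_value


def clamp_to_limit(x, limit):
    """Equivalent of: if (x > limit) x = limit; with a private condition."""
    exceeds = int(x > limit)      # stands in for a secure comparison
    # Both candidate values are computed; the condition only weights them.
    return private_select(exceeds, limit, x)
```

Because both branches are always executed, any public side effect inside either branch (for example, a write to a public variable) would occur regardless of the condition or would reveal it, which is why such effects must be rejected.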

Secure programs compiled with PICCO have been shown to run fast, as illustrated
in [13]. Our subsequent work that treated the use of pointers to private data [11] also provides
extensive analysis of data structures built via traditional pointer-based implementations. In addition,
the  compiler  has  already  been  used  to  build  implementations  that  securely  compute  with  private 
data  in  a  number  of  applications  such  as  processing  of  genomic  data  in  [12,  8]  and  evaluation  of 
statistical  tests,  which  will  be  available  in  a  forthcoming  technical  report. 